Transfer learning for scalar-on-function regression via control variates

Yang, Yuping, Zhou, Zhiyang

arXiv.org Machine Learning

Transfer learning (TL) has emerged as a powerful tool for improving estimation and prediction performance by leveraging information from related datasets. In this paper, we repurpose the control variates (CVs) method for TL in the context of scalar-on-function regression. Our proposed framework relies exclusively on dataset-specific summary statistics, avoiding the need to pool subject-level data and thus remaining applicable in privacy-restricted or decentralized settings. We establish theoretical connections among several existing TL strategies and derive convergence rates for our CVs-based proposals. These rates explicitly account for the typically overlooked smoothing error and reveal how the similarity among covariance functions across datasets influences convergence behavior. Numerical studies support the theoretical findings and demonstrate that the proposed methods achieve competitive estimation and prediction performance compared with existing alternatives.
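The control-variates idea this framework repurposes can be illustrated in its classical Monte Carlo form. A minimal sketch (the variables and constants below are illustrative, not the paper's estimator): a crude estimator X is adjusted by a correlated statistic C with known mean, which reduces variance without introducing bias.

```python
import numpy as np

# Classical control variates: given draws X of a target quantity and a
# correlated control variate C with known mean E[C] (here 0), the adjusted
# draws X - c*(C - E[C]) with c = Cov(X, C)/Var(C) have lower variance.
rng = np.random.default_rng(0)
n = 10_000
C = rng.normal(0.0, 1.0, n)                    # control variate, known mean 0
X = 2.0 + 1.5 * C + rng.normal(0.0, 0.5, n)    # crude draws of the target

c_opt = np.cov(X, C)[0, 1] / C.var()           # (near-)optimal coefficient
X_cv = X - c_opt * (C - 0.0)                   # adjusted draws

assert X_cv.var() < 0.5 * X.var()              # substantial variance reduction
```

The adjustment leaves the mean unchanged (since E[C] is subtracted) while removing the component of X explained by C; the paper's contribution is to drive such an adjustment purely from dataset-level summary statistics.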



One-shot Robust Federated Learning of Independent Component Analysis

Jin, Dian, Bing, Xin, Zhang, Yuqian

arXiv.org Machine Learning

This paper investigates a general robust one-shot aggregation framework for the distributed and federated Independent Component Analysis (ICA) problem. We propose a geometric-median-based aggregation algorithm that leverages $k$-means clustering to resolve the permutation ambiguity in local client estimations. Our method first performs $k$-means to partition client-provided estimators into clusters and then aggregates the estimators within each cluster using the geometric median. This approach provably remains effective even in highly heterogeneous scenarios where at most half of the clients observe only a minimal number of samples. The key theoretical contribution lies in the combined analysis of the geometric median's error bound, aided by sample quantiles, and the maximum misclustering rate of the aforementioned $k$-means solution. The effectiveness of the proposed approach is further supported by simulation studies conducted under various heterogeneous settings.
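The robust aggregation step can be sketched with Weiszfeld's algorithm for the geometric median (this sketch covers only the aggregation within one cluster, with made-up client estimates; the paper's full method additionally uses $k$-means to align permutations first):

```python
import numpy as np

def geometric_median(points, n_iter=500, eps=1e-8):
    """Weiszfeld's algorithm: iteratively re-weighted mean that minimizes
    the sum of Euclidean distances to the given points."""
    y = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(points - y, axis=1), eps)
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            break
        y = y_new
    return y

# hypothetical scenario: 9 honest clients near (1, 1), 4 corrupted far away
rng = np.random.default_rng(1)
honest = rng.normal([1.0, 1.0], 0.05, size=(9, 2))
corrupt = rng.normal([50.0, -50.0], 1.0, size=(4, 2))
estimates = np.vstack([honest, corrupt])

med = geometric_median(estimates)
# unlike the plain mean, the geometric median stays with the honest majority
assert np.linalg.norm(med - np.array([1.0, 1.0])) < 0.5
```

This illustrates the breakdown property the abstract invokes: as long as strictly fewer than half of the clients are arbitrarily bad, the geometric median remains close to the honest estimators.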


Distributed Learning of Mixtures of Experts

Chamroukhi, Faïcel, Pham, Nhat Thien

arXiv.org Machine Learning

In modern machine learning problems one often has to deal with datasets that are not centralized. This may be related to the application context, in which the data can by nature be available at different locations and not accessible in a centralized mode, or the data may be distributed for computational reasons when the amount of data is large. Indeed, even if the dataset is fully available in a centralized mode, implementing reasonable learning algorithms may be computationally demanding when the number of examples is large. The construction of distributed techniques in a Federated Learning setting (Yang et al., 2019), in which the model is trained collaboratively under the orchestration of a central server while keeping the data decentralized, is a growing area of research. The most attractive strategy is to perform standard inference on local machines to obtain local estimators, then transmit them to a central machine where they are aggregated to produce an overall estimator, while attempting to satisfy statistical guarantees. There have been many successful attempts in this direction of parallelizing existing learning algorithms and statistical methods. Those that may be mentioned here include, among others, parallelized stochastic gradient descent (Zinkevich et al., 2010), multiple linear regression (Mingxian et al., 1991), parallel K-means clustering based on MapReduce (Zhao et al., 2009), distributed learning for heterogeneous data via model integration (Merugu and Ghosh, 2005), and split-and-conquer approaches for penalized regressions (Chen and Xie, 2014), for logistic regression (Shofiyah and Sofro, 2018), and for k-clustering with heavy noise (Li and Guo, 2018). It is only very recently that a distributed learning approach has been proposed for mixture distributions, specifically for finite Gaussian mixtures (Zhang and Chen, 2022a).
In this paper we focus on mixtures of experts (MoE) models (Jacobs et al., 1991; Jordan and Xu, 1995), which extend the standard unconditional mixture distributions typically used for clustering purposes to model complex non-linear relationships of a response Y conditionally on some predictors X, for prediction purposes, while enjoying denseness results.
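The conditional structure of an MoE model can be sketched as follows: a softmax gating network mixes expert-specific regressions of Y on X, so the conditional mean is non-linear in X even though each expert is linear. All parameter values here are made up for illustration, not fitted:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

gate_w = np.array([[2.0], [-2.0]])   # gating weights (K x p), assumed values
beta = np.array([[1.0], [-1.0]])     # expert regression coefficients (K x p)

def moe_mean(x):
    """E[Y | X = x] = sum_k pi_k(x) * beta_k^T x, with softmax gating pi."""
    pi = softmax(x @ gate_w.T)       # gating probabilities pi_k(x)
    mu = x @ beta.T                  # expert-specific conditional means
    return (pi * mu).sum(axis=1)

x = np.array([[3.0], [-3.0]])
m = moe_mean(x)
# for large positive x the slope-(+1) expert dominates; for large negative x
# the slope-(-1) expert dominates, so both outputs are close to +3
assert m[0] > 2.9 and m[1] > 2.9
```

The two linear experts combine into an approximately |x|-shaped conditional mean, which is the kind of non-linear relationship the unconditional mixtures cannot capture.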


Variational Gradient Descent using Local Linear Models

Liu, Song, Simons, Jack, Yi, Mingxuan, Beaumont, Mark

arXiv.org Artificial Intelligence

Stein Variational Gradient Descent (SVGD) can transport particles along trajectories that reduce the KL divergence between the target and particle distributions, but it requires the target score function to compute the update. We introduce a new perspective on SVGD that views it as a local estimator of the reversed KL gradient flow. This perspective inspires us to propose new estimators that use local linear models to achieve the same purpose. The proposed estimators can be computed using only samples from the target and particle distributions, without needing the target score function. Our proposed variational gradient estimators utilize local linear models, resulting in computational simplicity while maintaining effectiveness comparable to SVGD in terms of estimation biases. Additionally, we demonstrate that, under a mild assumption, the estimation of a high-dimensional gradient flow can be translated into a lower-dimensional estimation problem, leading to improved estimation accuracy. We validate our claims with experiments on both simulated and real-world datasets.
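The baseline SVGD update the abstract refers to, which this paper's score-free estimators aim to replace, can be sketched in one dimension with an RBF kernel (bandwidth, step size, and the Gaussian target are illustrative choices):

```python
import numpy as np

def svgd_step(x, grad_logp, h=0.5, lr=0.1):
    """One SVGD update for 1-D particles x with an RBF kernel of bandwidth h:
    phi(x_i) = (1/n) sum_j [k(x_j, x_i) score(x_j) + d/dx_j k(x_j, x_i)]."""
    n = len(x)
    diff = x[:, None] - x[None, :]             # x_i - x_j
    K = np.exp(-diff**2 / (2 * h**2))          # k(x_j, x_i)
    drift = K @ grad_logp(x)                   # attraction toward high density
    repulse = (diff * K).sum(axis=1) / h**2    # kernel-gradient repulsion
    return x + lr * (drift + repulse) / n

rng = np.random.default_rng(2)
x = rng.normal(5.0, 0.5, 50)                   # particles start far off target
for _ in range(500):
    x = svgd_step(x, lambda z: -z)             # target N(0, 1): score(z) = -z
assert abs(x.mean()) < 0.5                     # particles settle near the mode
```

The `grad_logp` argument is exactly the target score function that must be known here; the paper's local-linear-model estimators substitute sample-based estimates for this quantity.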


Heterogeneous Federated Learning on a Graph

Wang, Huiyuan, Zhao, Xuyang, Lin, Wei

arXiv.org Artificial Intelligence

Federated learning, where algorithms are trained across multiple decentralized devices without sharing local data, is increasingly popular in distributed machine learning practice. Typically, a graph structure $G$ exists behind local devices for communication. In this work, we consider parameter estimation in federated learning with data distribution and communication heterogeneity, as well as limited computational capacity of local devices. We encode the distribution heterogeneity by parametrizing distributions on local devices with a set of distinct $p$-dimensional vectors. We then propose to jointly estimate parameters of all devices under the $M$-estimation framework with the fused Lasso regularization, encouraging an equal estimate of parameters on connected devices in $G$. We provide a general result for our estimator depending on $G$, which can be further calibrated to obtain convergence rates for various specific problem setups. Surprisingly, our estimator attains the optimal rate under a certain graph fidelity condition on $G$, as if we could aggregate all samples sharing the same distribution. If the graph fidelity condition is not met, we propose an edge selection procedure via multiple testing to ensure optimality. To ease the burden of local computation, a decentralized stochastic version of ADMM is provided, with convergence rate $O(T^{-1}\log T)$ where $T$ denotes the number of iterations. We highlight that our algorithm transmits only parameters along edges of $G$ at each iteration, without requiring a central machine, which preserves privacy. We further extend it to the case where devices are randomly inaccessible during the training process, with a similar algorithmic convergence guarantee. The computational and statistical efficiency of our method is evidenced by simulation experiments and the 2020 US presidential election data set.
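The effect of the fused Lasso penalty along an edge of $G$ can be illustrated on a two-device toy problem (the data, penalty level, and brute-force grid solver below are illustrative, not the paper's ADMM procedure):

```python
import numpy as np

# hypothetical toy: two connected devices estimating a common mean; the
# fused objective is the sum of local squared losses plus a fusion penalty
# lam * |theta1 - theta2| along the edge joining them
rng = np.random.default_rng(3)
y1 = rng.normal(1.0, 0.1, 100)   # samples on device 1
y2 = rng.normal(1.0, 0.1, 100)   # device 2, same underlying distribution

def objective(t1, t2, lam):
    return ((y1 - t1)**2).mean() + ((y2 - t2)**2).mean() + lam * abs(t1 - t2)

# brute-force grid minimization, adequate for a 2-parameter illustration
grid = np.linspace(0.5, 1.5, 201)
lam = 0.5
_, t1, t2 = min((objective(a, b, lam), a, b) for a in grid for b in grid)

# with a strong enough fusion penalty the two estimates coincide exactly,
# effectively pooling the samples across the edge -- the "as if we could
# aggregate all samples sharing the same distribution" phenomenon
assert abs(t1 - t2) < 1e-6
```

When the two devices' true parameters actually differ, the same penalty would instead bias the estimates toward each other, which is why the paper's edge selection step matters when the graph fidelity condition fails.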


Distributed Learning with Random Features

Li, Jian, Liu, Yong, Wang, Weiping

arXiv.org Machine Learning

Distributed learning and random projections are the most common techniques in large scale nonparametric statistical learning. In this paper, we study the generalization properties of kernel ridge regression using both distributed methods and random features. Theoretical analysis shows the combination remarkably reduces computational cost while preserving the optimal generalization accuracy under standard assumptions. In a benign case, $\mathcal{O}(\sqrt{N})$ partitions and $\mathcal{O}(\sqrt{N})$ random features are sufficient to achieve $\mathcal{O}(1/N)$ learning rate, where $N$ is the labeled sample size. Further, we derive more refined results by using additional unlabeled data to enlarge the number of partitions and by generating features in a data-dependent way to reduce the number of random features.
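The combination studied here, divide-and-conquer ridge regression on random Fourier features, can be sketched on a 1-D toy regression (the bandwidth, feature count, partition count, and penalty below are illustrative choices, not the paper's tuned values):

```python
import numpy as np

rng = np.random.default_rng(4)
N, D, lam = 2000, 200, 1e-3          # samples, random features, ridge penalty
x = rng.uniform(-3, 3, N)
y = np.sin(x) + rng.normal(0, 0.1, N)

# random Fourier features approximating a Gaussian (RBF) kernel
W = rng.normal(0, 1.0, D)
b = rng.uniform(0, 2 * np.pi, D)
feats = lambda x: np.sqrt(2.0 / D) * np.cos(np.outer(x, W) + b)

def fit(xp, yp):
    """Ridge regression in feature space on one partition's data."""
    Z = feats(xp)
    return np.linalg.solve(Z.T @ Z + lam * len(xp) * np.eye(D), Z.T @ yp)

# divide-and-conquer: fit each partition locally, then average the weights
parts = np.array_split(rng.permutation(N), 10)
w_bar = np.mean([fit(x[p], y[p]) for p in parts], axis=0)

x_test = np.linspace(-2.5, 2.5, 200)
pred = feats(x_test) @ w_bar
assert np.mean((pred - np.sin(x_test))**2) < 0.1
```

Each local solve costs $O(nD^2)$ on $n = N/m$ points instead of the $O(N^3)$ of exact kernel ridge regression on the pooled data, which is the computational saving the abstract quantifies.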


Optimal Convergence for Distributed Learning with Stochastic Gradient Methods and Spectral-Regularization Algorithms

Lin, Junhong, Cevher, Volkan

arXiv.org Machine Learning

We study generalization properties of distributed algorithms in the setting of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We first investigate distributed stochastic gradient methods (SGM), with mini-batches and multi-passes over the data. We show that optimal generalization error bounds can be retained for distributed SGM provided that the partition level is not too large. We then extend our results to spectral-regularization algorithms (SRA), including kernel ridge regression (KRR), kernel principal component analysis, and gradient methods. Our results are superior to the state-of-the-art theory. Particularly, our results show that distributed SGM has a smaller theoretical computational complexity, compared with distributed KRR and classic SGM. Moreover, even for non-distributed SRA, they provide the first optimal, capacity-dependent convergence rates, considering the case that the regression function may not be in the RKHS.
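The distributed SGM setup analyzed here, local multi-pass mini-batch gradient descent followed by one-shot averaging, can be sketched on a linear least-squares problem (a plain linear model stands in for the RKHS setting, and the step size, batch size, and partition count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
N, d = 4000, 5
w_true = rng.normal(0, 1, d)
X = rng.normal(0, 1, (N, d))
y = X @ w_true + rng.normal(0, 0.1, N)

def local_sgm(Xp, yp, lr=0.05, epochs=20, batch=10):
    """Multi-pass mini-batch gradient descent on one partition's squared loss."""
    w = np.zeros(d)
    for _ in range(epochs):                       # multiple passes over data
        for i in range(0, len(yp), batch):        # mini-batches
            Xi, yi = Xp[i:i+batch], yp[i:i+batch]
            w -= lr * Xi.T @ (Xi @ w - yi) / len(yi)
    return w

# partition the data, run SGM locally, average the local iterates
parts = np.array_split(np.arange(N), 8)
w_bar = np.mean([local_sgm(X[p], y[p]) for p in parts], axis=0)
assert np.linalg.norm(w_bar - w_true) < 0.1
```

Averaging reduces the variance of the local iterates; the paper's bounds characterize how large the partition level can be before this stops recovering the optimal rate.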


Distributed Parameter Estimation via Pseudo-likelihood

Liu, Qiang, Ihler, Alexander

arXiv.org Machine Learning

Estimating statistical models within sensor networks requires distributed algorithms, in which both data and computation are distributed across the nodes of the network. We propose a general approach for distributed learning based on combining local estimators defined by pseudo-likelihood components, encompassing a number of combination methods, and provide both theoretical and experimental analysis. We show that simple linear combination or max-voting methods, when combined with second-order information, are statistically competitive with more advanced and costly joint optimization. Our algorithms have many attractive properties including low communication and computational cost and "any-time" behavior.
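The simple linear combination with second-order information mentioned above can be sketched in its most basic form, inverse-variance weighting of local estimators (mean estimation stands in for the pseudo-likelihood components, and the sample sizes are made up):

```python
import numpy as np

# hypothetical network: each node estimates a common mean from a local
# sample of a different size; the center combines the local estimators
# using their estimated variances as second-order information
rng = np.random.default_rng(5)
true_mu, sizes = 3.0, [50, 200, 1000]
samples = [rng.normal(true_mu, 1.0, n) for n in sizes]

ests = np.array([s.mean() for s in samples])                 # local estimators
vars_ = np.array([s.var(ddof=1) / len(s) for s in samples])  # their variances

w = (1.0 / vars_) / (1.0 / vars_).sum()   # inverse-variance weights
combined = w @ ests
assert abs(combined - true_mu) < 0.2
```

Weighting by inverse variance down-weights the small-sample nodes, which is how a one-shot linear combination can approach the efficiency of joint optimization over the pooled data.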